{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "02bcd97d",
   "metadata": {},
   "source": [
    "# python网络爬虫应用实战（第20期）第5课书面作业\n",
    "学号：115799\n",
    "\n",
    "**作业内容：**  \n",
    "1. 用抓包工具捕捉登陆炼数成金时发送的请求与cookies，截图说明\n",
    "\n",
    "2. 尝试通过MechanizeSoup 登陆豆瓣（或其他不需要验证码的网站），爬取一个需要登陆后才能查看的元素。截图说明"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f927f3c3",
   "metadata": {},
   "source": [
    "## 第1题"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bcaa663d",
   "metadata": {},
   "source": [
    "截图如下，用Fiddler截获cookies信息。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "362e15d2",
   "metadata": {},
   "source": [
    "[![jFUkR0.png](https://s1.ax1x.com/2022/06/24/jFUkR0.png)](https://imgtu.com/i/jFUkR0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db0e0697",
   "metadata": {},
   "source": [
    "## 第2题"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "1783a303",
   "metadata": {},
   "source": [
    "抓取http://quotes.toscrape.com/login 网页，该网页形如：  \n",
    "[![jAPFv8.png](https://s1.ax1x.com/2022/06/26/jAPFv8.png)](https://imgtu.com/i/jAPFv8)\n",
    "代码如下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "afa58827",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<Response [200]>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import mechanicalsoup\n",
    "\n",
    "header = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.124 Safari/537.36 Edg/102.0.1245.44'\n",
    "browser = mechanicalsoup.StatefulBrowser(user_agent=header)\n",
    "browser.open(\"http://quotes.toscrape.com/login\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "92cf2533",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<input name=\"csrf_token\" type=\"hidden\" value=\"hWQtRXBbSVcOxyksZoMlPHpjwqYUnKriCaLENTdgAImvJFuzGDfe\"/>\n",
      "<input class=\"form-control\" id=\"username\" name=\"username\" type=\"text\"/>\n",
      "<input class=\"form-control\" id=\"password\" name=\"password\" type=\"password\"/>\n",
      "<input class=\"btn btn-primary\" type=\"submit\" value=\"Login\"/>\n"
     ]
    }
   ],
   "source": [
    "form = browser.select_form()\n",
    "form.print_summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "83cf78b2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<!DOCTYPE html>\n",
      "<html lang=\"en\">\n",
      "<head>\n",
      "\t<meta charset=\"UTF-8\">\n",
      "\t<title>Quotes to Scrape</title>\n",
      "    <link rel=\"stylesheet\" href=\"/static/bootstrap.min.css\">\n",
      "    <link rel=\"stylesheet\" href=\"/static/main.css\">\n",
      "</head>\n",
      "<body>\n",
      "    <div class=\"container\">\n",
      "        <div class=\"row header-box\">\n",
      "            <div class=\"col-md-8\">\n",
      "                <h1>\n",
      "                    <a href=\"/\" style=\"text-decoration: none\">Quotes to Scrape</a>\n",
      "                </h1>\n",
      "            </div>\n",
      "            <div class=\"col-md-4\">\n",
      "                <p>\n",
      "                \n",
      "                    <a href=\"/logout\">Logout</a>\n",
      "                \n",
      "                </p>\n",
      "            </div>\n",
      "        </div>\n",
      "    \n",
      "\n",
      "<div class=\"row\">\n",
      "    <div class=\"col-md-8\">\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n",
      "        <a href=\"/author/Albert-Einstein\">(about)</a> - <a href=\"http://goodreads.com/author/show/9810.Albert_Einstein\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"change,deep-thoughts,thinking,world\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/change/page/1/\">change</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/deep-thoughts/page/1/\">deep-thoughts</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/thinking/page/1/\">thinking</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/world/page/1/\">world</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“It is our choices, Harry, that show what we truly are, far more than our abilities.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">J.K. Rowling</small>\n",
      "        <a href=\"/author/J-K-Rowling\">(about)</a> - <a href=\"http://goodreads.com/author/show/1077326.J_K_Rowling\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"abilities,choices\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/abilities/page/1/\">abilities</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/choices/page/1/\">choices</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n",
      "        <a href=\"/author/Albert-Einstein\">(about)</a> - <a href=\"http://goodreads.com/author/show/9810.Albert_Einstein\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"inspirational,life,live,miracle,miracles\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/life/page/1/\">life</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/live/page/1/\">live</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/miracle/page/1/\">miracle</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/miracles/page/1/\">miracles</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">Jane Austen</small>\n",
      "        <a href=\"/author/Jane-Austen\">(about)</a> - <a href=\"http://goodreads.com/author/show/1265.Jane_Austen\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"aliteracy,books,classic,humor\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/aliteracy/page/1/\">aliteracy</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/books/page/1/\">books</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/classic/page/1/\">classic</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/humor/page/1/\">humor</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“Imperfection is beauty, madness is genius and it&#39;s better to be absolutely ridiculous than absolutely boring.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">Marilyn Monroe</small>\n",
      "        <a href=\"/author/Marilyn-Monroe\">(about)</a> - <a href=\"http://goodreads.com/author/show/82952.Marilyn_Monroe\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"be-yourself,inspirational\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/be-yourself/page/1/\">be-yourself</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“Try not to become a man of success. Rather become a man of value.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n",
      "        <a href=\"/author/Albert-Einstein\">(about)</a> - <a href=\"http://goodreads.com/author/show/9810.Albert_Einstein\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"adulthood,success,value\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/adulthood/page/1/\">adulthood</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/success/page/1/\">success</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/value/page/1/\">value</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“It is better to be hated for what you are than to be loved for what you are not.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">André Gide</small>\n",
      "        <a href=\"/author/Andre-Gide\">(about)</a> - <a href=\"http://goodreads.com/author/show/7617.Andr_Gide\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"life,love\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/life/page/1/\">life</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/love/page/1/\">love</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“I have not failed. I&#39;ve just found 10,000 ways that won&#39;t work.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">Thomas A. Edison</small>\n",
      "        <a href=\"/author/Thomas-A-Edison\">(about)</a> - <a href=\"http://goodreads.com/author/show/3091287.Thomas_A_Edison\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"edison,failure,inspirational,paraphrased\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/edison/page/1/\">edison</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/failure/page/1/\">failure</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/paraphrased/page/1/\">paraphrased</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“A woman is like a tea bag; you never know how strong it is until it&#39;s in hot water.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">Eleanor Roosevelt</small>\n",
      "        <a href=\"/author/Eleanor-Roosevelt\">(about)</a> - <a href=\"http://goodreads.com/author/show/44566.Eleanor_Roosevelt\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"misattributed-eleanor-roosevelt\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/misattributed-eleanor-roosevelt/page/1/\">misattributed-eleanor-roosevelt</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <div class=\"quote\" itemscope itemtype=\"http://schema.org/CreativeWork\">\n",
      "        <span class=\"text\" itemprop=\"text\">“A day without sunshine is like, you know, night.”</span>\n",
      "        <span>by <small class=\"author\" itemprop=\"author\">Steve Martin</small>\n",
      "        <a href=\"/author/Steve-Martin\">(about)</a> - <a href=\"http://goodreads.com/author/show/7103.Steve_Martin\">(Goodreads page)</a>\n",
      "        </span>\n",
      "        <div class=\"tags\">\n",
      "            Tags:\n",
      "            <meta class=\"keywords\" itemprop=\"keywords\" content=\"humor,obvious,simile\" /    > \n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/humor/page/1/\">humor</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/obvious/page/1/\">obvious</a>\n",
      "            \n",
      "            <a class=\"tag\" href=\"/tag/simile/page/1/\">simile</a>\n",
      "            \n",
      "        </div>\n",
      "    </div>\n",
      "\n",
      "    <nav>\n",
      "        <ul class=\"pager\">\n",
      "            \n",
      "            \n",
      "            <li class=\"next\">\n",
      "                <a href=\"/page/2/\">Next <span aria-hidden=\"true\">&rarr;</span></a>\n",
      "            </li>\n",
      "            \n",
      "        </ul>\n",
      "    </nav>\n",
      "    </div>\n",
      "    <div class=\"col-md-4 tags-box\">\n",
      "        \n",
      "            <h2>Top Ten tags</h2>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 28px\" href=\"/tag/love/\">love</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 26px\" href=\"/tag/inspirational/\">inspirational</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 26px\" href=\"/tag/life/\">life</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 24px\" href=\"/tag/humor/\">humor</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 22px\" href=\"/tag/books/\">books</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 14px\" href=\"/tag/reading/\">reading</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 10px\" href=\"/tag/friendship/\">friendship</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 8px\" href=\"/tag/friends/\">friends</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 8px\" href=\"/tag/truth/\">truth</a>\n",
      "            </span>\n",
      "            \n",
      "            <span class=\"tag-item\">\n",
      "            <a class=\"tag\" style=\"font-size: 6px\" href=\"/tag/simile/\">simile</a>\n",
      "            </span>\n",
      "            \n",
      "        \n",
      "    </div>\n",
      "</div>\n",
      "\n",
      "    </div>\n",
      "    <footer class=\"footer\">\n",
      "        <div class=\"container\">\n",
      "            <p class=\"text-muted\">\n",
      "                Quotes by: <a href=\"https://www.goodreads.com/quotes\">GoodReads.com</a>\n",
      "            </p>\n",
      "            <p class=\"copyright\">\n",
      "                Made with <span class='sh-red'>❤</span> by <a href=\"https://scrapinghub.com\">Scrapinghub</a>\n",
      "            </p>\n",
      "        </div>\n",
      "    </footer>\n",
      "</body>\n",
      "</html>\n"
     ]
    }
   ],
   "source": [
    "browser['username']='dotzhen'\n",
    "browser['password']='hello'\n",
    "\n",
    "response = browser.submit_selected()\n",
    "print(response.text)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8efff9c2",
   "metadata": {},
   "source": [
    "截图如下：  \n",
    "[![jAPAKS.png](https://s1.ax1x.com/2022/06/26/jAPAKS.png)](https://imgtu.com/i/jAPAKS)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  },
  "vscode": {
   "interpreter": {
    "hash": "9f35b62a1d17a2b36b9c54ddf6a1c189fdf51f8ec8cea898ba9fed29bc45b6fd"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
